ABSTRACT

Cosmic reionization progresses as H II regions form around sources of ionizing radiation. 
Their average size grows continuously until they percolate and complete reionization. We 
demonstrate how this typical growth can be calculated around the largest, biased sources of 
UV emission such as quasars by further developing an analytical model based on the excursion 
set formalism. This approach allows us to calculate the sizes and growth of the H II regions
created by the progenitors of any dark matter halo of given mass and redshift with a minimum
of free parameters. Statistical variations in the size of these pre-existing H II regions are an
additional source of uncertainty in the determination of very high redshift quasar properties
from their observed H II region sizes. We use this model to demonstrate that the transmission
gaps seen in very high redshift quasars can be understood from the radiation of only their 
progenitors and associated clustered small galaxies. The fit sets a lower limit on the redshift 
of overlap of z = 5.8 ± 0.1. This interpretation makes the transmission gaps independent of 
the age of the quasars observed. If this interpretation were correct it would raise the prospects 
of using radio interferometers currently under construction to detect the epoch of reionization. 
1 INTRODUCTION

Recent observations are just beginning to reveal the epoch of cos- 
mological reionization, which defines a fundamental transition in 
the universe, separating the cosmic dark ages (e.g. Rees 1997) from 
the epoch of galaxy formation and evolution. The appearance of a 
Gunn-Peterson trough (Gunn & Peterson 1965) in quasar spectra 
indicates that reionization was ending at z ~ 6 (e.g. Becker et al. 
2001; Fan et al. 2002; White et al. 2003), while the large-angle 
polarization anisotropy of the cosmic microwave background ob- 
served by the Wilkinson Microwave Anisotropy Probe (Spergel et 
al. 2006) indicates the universe may have been significantly reionized
by z ~ 10 (Page et al. 2006). Observations of Lyman-α emitting
galaxies at z ~ 5-7 are posing puzzles with regard to the
reionization history at those redshifts (e.g. Haiman 2002; Hu et al. 
2002; Malhotra & Rhoads 2004). 

In order to answer these questions much theoretical effort is 
underway. Numerical studies have provided insights into the asym- 
metric nature of ionization fronts (Abel, Norman & Madau 1999; 
Ciardi et al. 2001; Alvarez, Bromm, & Shapiro 2006a), radiative 
feedback (Ricotti, Gnedin, & Shull 2002; Shapiro, Iliev, & Raga 
2004; Whalen, Abel, & Norman 2004; Kitayama et al. 2004; Abel, 
Wise, & Bryan 2007; Susa & Umemura 2006; Johnson, Greif, 
& Bromm 2006; Ahn & Shapiro 2007), the reionization history
(Gnedin & Ostriker 1999; Ciardi, Ferrara, & White 2003; Sokasian 
et al. 2004), and the large-scale structure of reionization (Kohler, 
Gnedin, & Hamilton 2005a; Iliev et al. 2006a; Zahn et al. 2007).
Because of practical limitations, numerical studies are expensive 
and it is difficult to know which processes to simulate directly 
and which to parameterize. Analytical studies can play a comple- 
mentary role. Early studies modelled reionization by considering 
the growth of H II regions around sources of ionizing radiation in 
a homogeneous expanding universe with a clumping factor (e.g. 
Shapiro & Giroux 1987). While simplistic, models based on this as- 
sumption have proved valuable (e.g. Haiman & Loeb 1997; Madau, 
Haardt, & Rees 1999; Haiman & Holder 2003; Iliev, Scannapieco, 
& Shapiro 2005). Studies that describe the thermodynamics of the 
IGM have added further effects such as non-equilibrium ionization
and heating and an evolving UV background (e.g. Arons
& Wingert 1972; Shapiro, Giroux, & Babul 1994; Miralda-Escude 
& Rees 1994; Hui & Gnedin 1997; Miralda-Escude, Haehnelt, & 
Rees 2000; Hui & Haiman 2003; Choudhury & Ferrara 2005). 

In order to understand the large scale structure of reionization, 
analytical models for the sizes of H II regions during reionization have
recently been developed (Wyithe & Loeb 2004b; Furlanetto, Zaldarriaga,
& Hernquist 2004 - hereafter FZH04; Furlanetto & Oh
2005; Furlanetto, McQuinn, & Hernquist 2006a; Wyithe & Loeb 
2006; Cohn & Chang 2007). FZH04 found that typical H II re- 
gion sizes during reionization were 1-10 Mpc. These predictions 
were qualitatively verified by the radiative transfer simulations of 
Zahn et al. (2007). While they did not directly compare their sim- 
ulations to the analytical predictions, they did develop a "hybrid" 
technique and showed that it gives strikingly similar results to their 
radiative transfer simulations. The FZH04 model has been used to 
predict the 21-cm background (McQuinn et al. 2006) as well as the
kinetic Sunyaev-Zel'dovich effect (McQuinn et al. 2005). Furlanetto,
Zaldarriaga, & Hernquist (2006b) used it to predict the effect
of reionization on Lyα galaxy surveys, while Dijkstra, Wyithe,
& Haiman (2007) used it to provide a lower limit to the ionized 
fraction at z = 6.5. Kramer, Haiman, & Oh (2006) extended the 
FZH04 model to include the effects of feedback on the size dis- 
tribution. Alvarez et al. (2006b) used the model to estimate the 
cross-correlation between the cosmic microwave and 21-cm back- 
grounds on large scales. 

Measurement of individual H II region sizes would provide 
strong constraints on the sources responsible for reionization. Until 
now, such detection has remained elusive, due to complexities in 
the interpretation of quasar spectra. These difficulties arise in the 
interpretation of the "transmission gap" between the quasar's Lyα
line and the onset of the Gunn-Peterson trough. This gap has been 
interpreted as corresponding to the quasar's own H II region (e.g., 
Mesinger & Haiman 2004; Wyithe & Loeb 2004a). Alternatively 
the gap may be determined by a combination of the flux from the 
quasar and the background UV radiation field - the Gunn-Peterson 
trough sets in whenever their combined flux cannot keep the IGM 
sufficiently ionized (e.g., Yu & Lu 2005; Fan et al. 2006; Bolton & 
Haehnelt 2007). In this case, the H II region could be much larger 
than the size corresponding to the transmission gap, but note [...] sizes
near overlap, paying particular attention to the effect of recombina- 
tions on the mean free path of ionizing photons, while Lidz, Oh, & 
Furlanetto (2006), Bolton & Haehnelt (2007), Mesinger & Haiman 
(2007), and Maselli et al. (2007) examined the effects on quasar 
spectra due to density fluctuations in the IGM. 

Here we will model the size of pre-existing H II regions that
existed around high redshift quasars before they began to shine.
We will use the conditional H II region size distribution, which 
describes statistically the size distribution of H II regions that sur- 
round haloes of a given mass. In §2 we review the model and derive 
the distribution. In §3 we discuss the implications for quasar H II 
regions, and conclude with a discussion in §4. We adopt parameters 
based on WMAP 3-year observations (Spergel et al. 2006),
$(\Omega_m h^2, \Omega_b h^2, h, n_s, \sigma_8) = (0.13, 0.022, 0.73, 0.95, 0.74)$.



2 CONDITIONAL H II REGION SIZES 

In the model of FZH04, the size of the H II region in which a point 
lies is determined by finding the largest spherical region centered 
on the point for which the mean collapsed fraction $f_{\rm coll} \geq \zeta^{-1}$,
where $\zeta$ is an efficiency parameter which describes how many ionizing
photons are produced per collapsed atom. By using the extended
Press-Schechter formalism, this condition can be expressed
in terms of the mean overdensity of the region, $\delta_m$, by

$\delta_m \geq \delta_x(m,z) \equiv \delta_c(z) - \sqrt{2}\,{\rm erf}^{-1}(1-\zeta^{-1})\,[\sigma^2_{\rm min} - \sigma^2(m)]^{1/2}$, (1)

where $m$ is the mass of the region, $\sigma^2(m)$ is the variance of density
fluctuations, $\delta_c(z)$ is the threshold overdensity for collapse, and
$\sigma^2_{\rm min} \equiv \sigma^2(m_{\rm min})$ is the variance on the scale of the minimum
halo mass which contributes to reionization, $m_{\rm min}$. By finding a
linear approximation to the "barrier" $\delta_x$, denoted here by $B(m,z)$,
FZH04 obtained an analytic expression for the size distribution of H II regions.
Additional details can be found in FZH04.
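
As an illustration of the barrier and its linear approximation, the following is a minimal sketch. It assumes the FZH04 form $\delta_x = \delta_c - \sqrt{2}K(\zeta)[\sigma^2_{\rm min} - \sigma^2]^{1/2}$ with $K(\zeta) = {\rm erf}^{-1}(1 - \zeta^{-1})$, expanded to first order about $\sigma^2 = 0$. The function names and the numerical inputs (S_min, delta_c, zeta) are hypothetical placeholders chosen only for the example, not values taken from this work.

```python
import math

def erfinv(y, tol=1e-12):
    """Inverse error function on (-1, 1), found by bisection on math.erf."""
    lo, hi = -6.0, 6.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if math.erf(mid) < y:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def exact_barrier(S, S_min, delta_c, zeta):
    """Ionization barrier as a function of the variance S = sigma^2(m)."""
    K = erfinv(1.0 - 1.0 / zeta)
    return delta_c - math.sqrt(2.0) * K * math.sqrt(S_min - S)

def linear_barrier(S, S_min, delta_c, zeta):
    """First-order expansion of the barrier about S = 0: B0 + B1 * S."""
    K = erfinv(1.0 - 1.0 / zeta)
    B0 = delta_c - math.sqrt(2.0) * K * math.sqrt(S_min)
    B1 = K / math.sqrt(2.0 * S_min)
    return B0 + B1 * S

# Illustrative placeholder inputs (assumptions, not fitted values).
S_MIN, DELTA_C, ZETA = 20.0, 10.7, 15.0
for S in (0.0, 1.0, 5.0):
    print(S, exact_barrier(S, S_MIN, DELTA_C, ZETA),
          linear_barrier(S, S_MIN, DELTA_C, ZETA))
```

The two curves agree at S = 0 and separate slowly at larger S, which is why a linear barrier is a useful approximation on the large scales relevant for H II regions.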

What is the probability, $f(M_b|M)\,dM_b$, that a halo of mass
$M$ will be located in an ionized bubble with a size between $M_b$
and $M_b + dM_b$? This can be found by considering the halo to
be the locus of the first upcrossing of a random walk in $\delta$. First,
we determine the conditional probability, $f(S,\delta_c|S_b,\delta_b)\,dS$, that
a point which first crossed the barrier at $S_b = \sigma^2(M_b)$ and
$\delta_b \equiv B(M_b,z)$, crosses the halo barrier $\delta_c$ between $S = \sigma^2(M_h)$
and $S + dS$. As has been discussed previously (e.g. Sheth 1998),
this can be accomplished by using the standard expression for the
distribution of up-crossings of a linear barrier for trajectories starting
from $S = 0$ and $\delta = 0$, but moving the origin to $(S_b, \delta_b)$
(see also Furlanetto, McQuinn, & Hernquist 2006a, equation 2).
The conditional probability that the first up-crossing of the bubble
barrier occurred between $S_b$ and $S_b + dS_b$, given that it crosses the
halo barrier $\delta_c$ at $S > S_b$, is

$f(S_b,\delta_b|S,\delta_c)\,dS_b = \frac{f(S_b,\delta_b)}{f(S,\delta_c)}\,f(S,\delta_c|S_b,\delta_b)\,dS_b$, (7)

where $f(S_b,\delta_b)$ and $f(S,\delta_c)$ are the unconditional first-crossing
distributions of the bubble and halo barriers, respectively.

Thus, the probability that a halo of mass $M$ is inside an ionized
bubble of mass between $M_b$ and $M_b + dM_b$ is

$f(M_b|M)\,dM_b = f(S_b,\delta_b|S,\delta_c)\,\left|\frac{dS_b}{dM_b}\right|\,dM_b$. (8)
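
The Bayes step in equation (7) can be sketched numerically. The sketch below assumes the standard first-crossing distribution of a linear barrier $B_0 + B_1 S$ for walks starting at the origin, $f(S) = B_0 (2\pi S^3)^{-1/2}\exp[-(B_0 + B_1 S)^2/2S]$, with the flat halo barrier treated as the $B_1 = 0$ case. The function names and the numerical values of B0, B1, delta_c and S are hypothetical choices made only for illustration.

```python
import math

def f_linear(S, B0, B1):
    """First-crossing distribution of the barrier B0 + B1*S for walks from (0, 0)."""
    return (B0 / math.sqrt(2.0 * math.pi * S**3)
            * math.exp(-(B0 + B1 * S) ** 2 / (2.0 * S)))

def f_shifted(S, delta_c, S_b, delta_b):
    """First crossing of the flat halo barrier delta_c for walks from (S_b, delta_b)."""
    dS, dd = S - S_b, delta_c - delta_b
    return dd / math.sqrt(2.0 * math.pi * dS**3) * math.exp(-dd**2 / (2.0 * dS))

def f_bubble_given_halo(S_b, S, delta_c, B0, B1):
    """Bayes' theorem: distribution of the bubble scale S_b given the halo scale S."""
    delta_b = B0 + B1 * S_b
    prior = f_linear(S_b, B0, B1)        # unconditional bubble-barrier crossing
    halo = f_linear(S, delta_c, 0.0)     # unconditional halo-barrier crossing
    return prior * f_shifted(S, delta_c, S_b, delta_b) / halo

def integrate_over_bubbles(S, delta_c, B0, B1, n=20000):
    """Midpoint-rule integral of the conditional distribution over 0 < S_b < S."""
    h = S / n
    return h * sum(f_bubble_given_halo((i + 0.5) * h, S, delta_c, B0, B1)
                   for i in range(n))

# Illustrative placeholder values with the bubble barrier below delta_c for S_b < S.
print(integrate_over_bubbles(S=6.0, delta_c=8.0, B0=2.0, B1=0.3))
```

When the bubble barrier lies below the halo barrier on all scales larger than the halo, every trajectory that forms the halo must first have crossed the bubble barrier, so the conditional distribution should integrate to unity. That makes the integral a convenient consistency check on an implementation.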
Shown in Fig. 1 are the median, 68%, and 95% contours of the
distribution $f(M_b|M_h)$, for different halo masses $M_h$, as well as
the "global" relationship given by the unconditional size distribution. The figure shows
comoving radius rather than mass, defined by the relation
$R = [3M/(4\pi\rho_0)]^{1/3}$, where $\rho_0$ is the mean comoving matter density.
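
For orientation, the mass-radius conversion can be evaluated as in the short sketch below. It assumes that $\rho_0$ is the mean comoving matter density, $\rho_0 = \Omega_m h^2 \times 2.775\times10^{11}\,M_\odot\,{\rm Mpc}^{-3}$, with $\Omega_m h^2 = 0.13$ as adopted above. The function names are hypothetical.

```python
import math

RHO_CRIT_H2 = 2.775e11  # critical density in h^2 Msun / Mpc^3

def comoving_radius_mpc(mass_msun, omega_m_h2=0.13):
    """Comoving radius R = [3M / (4 pi rho_0)]^(1/3) of a region of mass M."""
    rho0 = omega_m_h2 * RHO_CRIT_H2  # mean comoving matter density, Msun / Mpc^3
    return (3.0 * mass_msun / (4.0 * math.pi * rho0)) ** (1.0 / 3.0)

def mass_from_radius_msun(radius_mpc, omega_m_h2=0.13):
    """Inverse relation: mass enclosed by a comoving sphere at mean density."""
    rho0 = omega_m_h2 * RHO_CRIT_H2
    return 4.0 * math.pi * rho0 * radius_mpc**3 / 3.0

print(comoving_radius_mpc(1e12))     # Lagrangian radius of a 1e12 Msun halo
print(mass_from_radius_msun(40.0))   # mass inside a 40 comoving Mpc bubble
```

With these assumptions a $10^{12}\,M_\odot$ halo corresponds to a comoving radius just under 2 Mpc, far smaller than the tens of Mpc bubbles discussed in the text.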



3 QUASAR H II REGIONS 

In this section we will discuss the implications of the conditional 
H II region size distribution on observations of quasar H II regions. 
Before proceeding, we briefly describe what we mean by quasar 
H II region, especially near the end of reionization. 

Late in reionization, close to percolation, the H II region sizes 
grow rapidly. This is of course the expected behaviour, but it is not 
clear what meaning to attach to any given H II region when most 
of the universe is already ionized. For H II regions around very rare 
quasar host halos, it is reasonable to expect that the central H II re- 
gion in which the quasar forms is much larger than the surrounding 
H II regions. While the mean global ionized fraction may be high,
the fluctuations in ionized fraction outside of the central H II region
are likely to be on much smaller scales.

Figure 1. Shown is the evolution of the median bubble size (solid), along
with the 68% (dark shaded region) and 95% (light shaded region) contours
of the distribution. The "Global" panel is for the unconditional mean
distribution, while the other panels are the conditional distributions given that
the H II regions surround halos of total mass $M_h$, as labelled. We assumed
a minimum source halo virial temperature of $10^4$ K.



3.1 Interpretation of observed H II region sizes 

The variation in the size of pre-existing H II regions that surround 
quasars when they begin to shine manifests itself in a theoretical 
uncertainty in the determination of properties of quasars, such as 
their ionizing photon luminosity and age, as well as properties of 
the surrounding medium, such as the neutral fraction and gas density.
For a quasar that has been on for a time $t$ and has an ionizing
photon luminosity $\dot{N}_\gamma$, the observed line-of-sight radius of its H II
region, $R_{\rm obs}$, is related to the radius of the H II region in which it
initially lies, $R_b$, by

$V \equiv V_{\rm obs} - \frac{4\pi}{3}R_b^3 = \frac{\dot{N}_\gamma t}{n_{\rm H}(1-x)}$, (9)

where $V_{\rm obs} = 4\pi R_{\rm obs}^3/3$ is the volume enclosed by the observed
H II region, $n_{\rm H}$ is the hydrogen number density, $x$ is the ionized
fraction, and we have assumed that the apparent age of the quasar,
$t$, is less than the recombination time,
$t_{\rm rec} \simeq 1.9\,{\rm Gyr}\,C_l^{-1}[(1+z)/7]^{-3}$, where $C_l$ is the clumping factor. Because the
H II region size is measured along the line of sight, this equation is
correct even when the ionization front velocity is relativistic (e.g.
White et al. 2003; Shapiro et al. 2006). The quantity $V$ is the volume
that is actually ionized by the quasar itself, excluding the volume
already ionized by existing nearby sources.

Figure 2. Distribution of error in volume ionized by the quasar, $\Delta_{\rm err}$
(equation 10), for a given observed comoving H II region radius, $R_{\rm obs} = 20$ Mpc
(dotted), 40 Mpc (dashed), and 60 Mpc (long dashed), for a central halo
mass $M_h = 10^{12}\,M_\odot$, a minimum source halo virial temperature of $10^4$ K,
and an efficiency $\zeta = 15$. The upper panels are the cumulative distribution,
while the lower panels are the differential distribution.

What is the uncertainty in the determination of $V$, given that
the initial radius of the H II region, $R_b$, is unknown? We express
this uncertainty in terms of an error

$\Delta_{\rm err} \equiv \frac{V_{\rm obs} - V}{V} = \frac{R_b^3}{R_{\rm obs}^3 - R_b^3}$, (10)

which is the fractional amount by which the uniform-IGM estimate
for the volume ionized by the quasar, $V_{\rm obs}$, exceeds the actual one,
$V$. We obtain the distribution of $\Delta_{\rm err}$ from $P$, the cumulative
probability, e.g. $P(<M_b|M_h) = P(<R_b|M_h) = \int_0^{M_b} f(M|M_h)\,dM$,
where $f(M|M_h)$ is the conditional
bubble size distribution as defined in equation (8), $M_b =
4\pi\rho_0 R_b^3/3$, and $M_h = 10^{12}\,M_\odot$. The error distribution is shown
in Fig. 2 for different values of $R_{\rm obs}$. For larger observed H II regions,
the overestimate of $V$ is smaller. If the ionizing luminosity
and age of the quasar are known, as well as the density of the surrounding
gas, then the neutral fraction can be determined according
to

$x_{\rm HI} = \frac{\dot{N}_\gamma t}{n_{\rm H} V}$. (12)



Of course, the luminosity, age, and density are not known ex- 
actly, but arguments can be made based upon their likely values 
(e.g. Wyithe & Loeb 2004a). The error in $V$ is related to an error
in $x_{\rm HI}$ by $x_{\rm HI}/x_{\rm HI,0} = 1 + \Delta_{\rm err}$, where $x_{\rm HI,0}$ is the neutral fraction
inferred by assuming the quasar began to shine in a completely
neutral medium. Since $\Delta_{\rm err} > 0$, the neutral fraction is always underestimated
when the presence of existing H II regions is not taken
into account. There are therefore two reasons why higher neutral 
fractions lead to smaller H II regions. First, higher neutral fractions 
lead to slower ionization fronts propagating away from the quasar 
and thus to smaller H II regions. Second, the higher the neutral frac- 
tion, the smaller the pre-existing H II region, which also leads to a 
smaller observed H II region. 
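
The relation between the pre-existing bubble radius, the volume error and the neutral-fraction correction can be sketched as follows. The sketch assumes that the error is the fractional excess of the observed volume over the volume ionized by the quasar, $\Delta_{\rm err} = R_b^3/(R_{\rm obs}^3 - R_b^3)$, as described above. The function names and example radii are hypothetical.

```python
def volume_error(r_b, r_obs):
    """Fractional excess of the observed volume over the quasar-ionized volume."""
    if not 0.0 <= r_b < r_obs:
        raise ValueError("require 0 <= r_b < r_obs")
    return r_b**3 / (r_obs**3 - r_b**3)

def bubble_radius_for_error(delta_err, r_obs):
    """Pre-existing bubble radius that produces a given volume error."""
    return r_obs * (delta_err / (1.0 + delta_err)) ** (1.0 / 3.0)

def neutral_fraction_correction(delta_err):
    """Ratio of the true neutral fraction to the one inferred with no bubble."""
    return 1.0 + delta_err

# Example: a 40 comoving Mpc observed region with a 20 or 35 Mpc pre-existing bubble.
for r_b in (20.0, 35.0):
    d = volume_error(r_b, 40.0)
    print(r_b, d, neutral_fraction_correction(d))
```

Because the error depends on the cube of the radius ratio, a pre-existing bubble half the observed size changes the inferred neutral fraction by only about 14 percent, while one filling 87 percent of the observed radius changes it by a factor of three.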



3.2 Proximity zones in quasar spectra 

Fan et al. (2006) define "proximity zones" as regions where the 
transmission is greater than ten percent, but do not explicitly as- 
sociate them with the size of the quasar's H II region. These 
proximity zones have radii which increase steeply over the range 
6.4 > z > 5.8, from $R_p \sim 30$ to $R_p \sim 80$ comoving Mpc
(Fig. 3). These values are similar to the size of H II regions around
$10^{12}\,M_\odot$ halos (a likely minimum value for host halo masses of observed
quasars at z ~ 6; e.g. Volonteri & Rees 2006; Li et al. 2006)
near the end of reionization (Fig. 1).

Shown also in Fig. 3 is the typical size of H II regions surrounding
halos with masses $M_h = 10^{12}\,M_\odot$ for a model in which
the ionized fraction reaches unity at $z_{\rm ov} = 5.8$ and the integrated
Thomson scattering optical depth (assuming once ionized helium)
is $\tau_{\rm es} = 0.055$. The theoretical expectation for the evolution of H II
region sizes agrees quite well with the measured proximity zone
sizes. This suggests that the evolving sizes of the proximity zones 
measured by Fan et al. (2006) can be explained by the growth of 
cosmic H II regions driven by clustered sources around the quasars 
alone. Taking into account the contribution from the quasars could 
change the theoretical prediction, since the flux contributed by the 
quasar at distances greater than the size of the pre-existing bubble 
may be large enough to increase transmission there. For a quasar
with a power-law spectral shape, $L_\nu \propto \nu^{-\alpha}$ (e.g., Bolton & Haehnelt 2007),
and ionizing photon luminosity $\dot{N}_\gamma$, ionization equilibrium determines
the neutral fraction at a comoving distance $R$,
while the neutral fraction necessary to obtain a Gunn-Peterson optical
depth $\tau_{\rm GP}$ is given by Fan et al. (2006).
For reasonable IGM clumping factors and quasar ionizing photon
luminosities, therefore, the quasar by itself is able to cause trans- 
mission at a distance of 40 comoving Mpc, very similar to the size 
of the transmission gaps of the highest redshift quasars measured 
by Fan et al. (2006). Thus, there are two possible explanations for 
the rapid increase in the size of the transmission gaps: 1) H II regions,
defined as regions where hydrogen is mostly ionized, are
much larger than the observed transmission gaps at z < 6.4. The
transmission gaps correspond to smaller regions, within which the
evolving UV intensity from a combination of nearby galaxies and
the quasar ionizes the IGM to a level sufficient to allow transmission,
or 2) the transmission gaps correspond to the extent of the
H II regions themselves, which grow rapidly as overlap takes place
at z ~ 6. The UV background intensity within these H II regions
is dominated by galaxies clustered around the central quasar and is 
sufficient to cause the observed transmission. 
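
The order-of-magnitude reasoning about the quasar's own flux can be sketched as below. This is a uniform-density estimate under stated assumptions, not the calculation behind the quoted result: highly ionized gas with $n_e \approx n_{\rm H}$, case-B recombination at $10^4$ K, a photoionization cross-section scaling as $\nu^{-3}$, a spectral index $\alpha = 1.5$, matter-dominated expansion, and an illustrative luminosity of $10^{57}$ photons per second. All function names and constants are hypothetical choices.

```python
import math

MPC_CM = 3.0857e24     # centimetres per Mpc
ALPHA_B = 2.59e-13     # case-B recombination coefficient at 1e4 K, cm^3 / s
SIGMA_0 = 6.30e-18     # H I photoionization cross-section at threshold, cm^2
N_H0 = 1.88e-7         # mean hydrogen density today, cm^-3 (Omega_b h^2 = 0.022)
H100 = 3.2408e-18      # 100 km/s/Mpc expressed in s^-1
LYA_COEFF = 1.343e-7   # (pi e^2 / m_e c) * f * lambda for Lyman-alpha, cm^3 / s

def photoionization_rate(n_dot, r_comoving_mpc, z, alpha=1.5):
    """Gamma (s^-1) at comoving distance R from a source with L_nu ~ nu^-alpha."""
    r = r_comoving_mpc * MPC_CM / (1.0 + z)       # proper distance in cm
    sigma_eff = SIGMA_0 * alpha / (alpha + 3.0)   # photon-weighted cross-section
    return n_dot * sigma_eff / (4.0 * math.pi * r * r)

def equilibrium_neutral_fraction(n_dot, r_comoving_mpc, z, clumping=1.0, alpha=1.5):
    """x_HI from balancing recombinations against photoionizations (x_HI << 1)."""
    n_h = N_H0 * (1.0 + z) ** 3
    gamma = photoionization_rate(n_dot, r_comoving_mpc, z, alpha)
    return clumping * ALPHA_B * n_h / gamma

def gunn_peterson_tau(x_hi, z, omega_m_h2=0.13):
    """Gunn-Peterson optical depth of uniform gas with neutral fraction x_hi."""
    n_hi = x_hi * N_H0 * (1.0 + z) ** 3
    hubble = H100 * math.sqrt(omega_m_h2) * (1.0 + z) ** 1.5  # high-z limit
    return LYA_COEFF * n_hi / hubble

x = equilibrium_neutral_fraction(1e57, 40.0, 6.4)
print(x, gunn_peterson_tau(x, 6.4))
```

In this approximation the equilibrium neutral fraction scales as $R^2 C_l / \dot{N}_\gamma$, so the distance out to which the quasar alone permits transmission depends on the assumed luminosity, clumping factor and local density.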



4 DISCUSSION 

We have used the conditional H II region size distribution around 
halos with a mass $\sim 10^{12}\,M_\odot$ to model observations of quasar H II
regions. Our results can be summarized as follows: 

1. Due to the biased location of quasars, the H II regions that 
surround them just before they begin to emit ionizing radiation are 
likely to be large, with radii of order tens of Mpc, even when the 
mean ionized fraction is of order thirty percent. 

2. For observed quasar H II regions with sizes of order tens of 
Mpc, it is difficult to determine the properties of the surrounding 
medium, due to uncertainties in the size of pre-existing H II re- 
gions - neutral fractions will be underestimated for a given quasar 
luminosity and lifetime. This effect is strongest for small observed 
H II regions around quasars late in the reionization epoch. 
Figure 3. Median (solid) and 68% contours (dashed) of the size of H II
regions surrounding a $10^{12}\,M_\odot$ halo vs. redshift for a model with $\zeta = 13.5$
and $z_{\rm ov} = 5.8$. Also shown are the "proximity zone" radii (triangles) of
sixteen quasars as determined by Fan et al. (2006). The dotted lines are the
ionized and neutral fractions multiplied by 100. Shown also is a linear fit
(long dashed) of the proper size vs. redshift, as in Fan et al. (2006).



3. The observed transmission gaps around high-redshift quasars 
may have a direct correspondence to cosmic H II regions created 
by galaxies clustered around the central quasar. The steep increase 
in the transmission gap size at z < 6.4 can be explained by the 
rapid growth of these H II regions at the epoch of overlap. 

We expect conclusion 1 to be generally true, but conclusions 2 and 
3 require further explanation. 

For conclusion 2, the situation is more complicated when 
quasar H II region sizes are obtained from transmission gaps in 
their spectra. In this case, effects such as a boost in the neutral hydrogen
abundance due to clumping and Lyα absorption from the
damping wing of neutral gas outside the H II region can cause the 
inferred H II region size to be smaller than it actually is (Bolton 
& Haehnelt 2007; Maselli et al. 2007). This underestimate leads to 
an overestimate of neutral hydrogen abundance for a given quasar 
luminosity and lifetime. Neglecting the H II region created by the 
nearby galaxies and their progenitors, on the other hand, leads to an 
underestimate of the neutral fraction. A natural question is which 
of these two competing effects is stronger. For example, Maselli 
et al. (2007) find that the physical radius of the H II region is about
40 percent larger than that corresponding to the transmission gap,
which corresponds to an underestimate of its volume by a factor of
about 3, and an overestimate of the neutral fraction by that same factor. In
order to offset this effect, $\Delta_{\rm err} > 2$ would be necessary. For example,
the probability that a quasar surrounded by a 40 comoving Mpc
H II region will have $\Delta_{\rm err} > 2$ is about 30 percent, for $x = 0.9$
at $z = 6.15$ with $T_{\rm min} = 10^4$ K, as shown in Fig. 2. At earlier
times and lower ionization fraction, $x = 0.75$ and $z = 6.5$, the
probability is only ten percent. The two competing effects are thus 
similar in magnitude when the ionized fraction is high, x ~ 0.9, but 
the "apparent shrinking" reported by Maselli et al. (2007) is likely 
to dominate at earlier times, when x < 0.5. These estimates are 
quite uncertain, however - rarer sources and/or quasar hosts could 
make these probabilities higher by increasing the pre-existing bub- 
ble size, and more detailed calculations will be necessary to refine 
these estimates further. 
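
The threshold quoted in this comparison follows from simple arithmetic. As a sketch, assume that the H II region and transmission gap radii differ by a factor of 1.4 and that the pre-existing bubble must compensate the full volume factor through $x_{\rm HI}/x_{\rm HI,0} = 1 + \Delta_{\rm err}$ (the subscripts "H II" and "gap" are labels introduced here for illustration):

```latex
\frac{V_{\rm H\,II}}{V_{\rm gap}} = (1.4)^3 \simeq 2.7 \approx 3,
\qquad
1 + \Delta_{\rm err} = 3 \;\Rightarrow\; \Delta_{\rm err} = 2 .
```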
For conclusion 3 to be valid, it is necessary for the UV radiation
field within ionized bubbles to keep the volume-averaged
neutral fraction low enough to create the observed transmission 
gaps. This is plausible, since the UV radiation field near quasars 
is likely enhanced due to their biased environment (e.g., Yu & Lu 
2005), with a size approaching that of the mean free path imposed 
by Lyman-limit systems (~ 50 comoving Mpc at z ~ 6; Gnedin 
& Fan 2006). 

There have been recent suggestions that detection of large 
(~ 50 comoving Mpc) H II regions by 21-cm tomography can 
reveal the locations of quasars and bright galaxies that just under- 
went an AGN phase (Wyithe, Loeb & Barnes 2005; Kohler et al. 
2005b). Our results here suggest that large H II regions may be
visible at relatively low redshifts, z ~ 7, where foreground con- 
tamination of the 21-cm observations is lowest. Thus, the 21-cm 
observations hold the most promise for discriminating between the
two possible explanations for the evolution of transmission gap
sizes discussed here. In addition, measurement of the sizes of H II 
regions around bright quasars by 21-cm tomography, when com- 
bined with the conditional H II region size distribution, will provide 
powerful constraints on the theory of reionization and the nature of 
high-redshift quasars. 

In our model the global ionized fraction is proportional to the 
collapsed fraction, so that the same ionizing efficiency is assigned 
to collapsed matter, regardless of halo mass, epoch, or environment. 
In such a model, the ionized fraction grows exponentially with 
time. The reionization history is likely to be more complex than 
this, especially in light of the optical depth measured by WMAP. 
For example, self-regulated reionization, in which the lowest mass 
objects do not produce ionizing photons when they form within 
existing H II regions, can extend the reionization epoch, relieving 
the tension between a percolation at z ~ 6 and the WMAP value 
of $\tau_{\rm es} \sim 0.09$ (Haiman & Bryan 2006; Iliev et al. 2006b). We
note, however, that our best-fitting model to the evolution of the 
transmission gap sizes, in which overlap is complete at z ~ 5.8 
and $\tau_{\rm es} \simeq 0.055$, is in marginal agreement with the value obtained
from WMAP 3 year polarization data, $\tau_{\rm es} = 0.088^{+0.028}_{-0.034}$ (Spergel
et al. 2006). This indicates that the level of early reionization de- 
manded by the WMAP measurement may be quite modest, while 
still allowing for a late overlap at z = 5.8. 
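
To put the optical depth values in context, the sketch below evaluates the Thomson optical depth for the limiting case of instantaneous reionization at a single redshift. It assumes a flat universe with the parameters adopted above, a helium mass fraction of 0.24 and singly ionized helium. It is not the extended ionization history of the model discussed here, so it gives only the contribution from redshifts below overlap. The function name and constants are hypothetical.

```python
import math

C_CM_S = 2.998e10          # speed of light, cm / s
SIGMA_T = 6.652e-25        # Thomson cross-section, cm^2
RHO_CRIT_H2 = 1.879e-29    # critical density, h^2 g / cm^3
M_P = 1.6726e-24           # proton mass, g
H100 = 3.2408e-18          # 100 km/s/Mpc expressed in s^-1

def tau_es_instantaneous(z_reion, h=0.73, omega_m_h2=0.13,
                         omega_b_h2=0.022, y_he=0.24):
    """Thomson optical depth if the IGM is fully ionized at all z < z_reion."""
    omega_m = omega_m_h2 / h**2
    omega_l = 1.0 - omega_m                         # flat universe assumed
    n_h0 = RHO_CRIT_H2 * omega_b_h2 * (1.0 - y_he) / M_P
    electrons_per_h = 1.0 + y_he / (4.0 * (1.0 - y_he))   # H II plus He II
    e_z = math.sqrt(omega_m * (1.0 + z_reion) ** 3 + omega_l)
    # Closed form of the integral of (1+z)^2 / E(z) from 0 to z_reion.
    integral = 2.0 / (3.0 * omega_m) * (e_z - 1.0)
    return C_CM_S * SIGMA_T * n_h0 * electrons_per_h / (h * H100) * integral

for z in (5.8, 8.0, 10.0):
    print(z, tau_es_instantaneous(z))
```

Under these assumptions, the gas ionized after z = 5.8 alone contributes less than the 0.055 of the best-fitting model, with the remainder arising from partial ionization at earlier times.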

The fact that H II regions are likely to surround quasars before 
they begin to shine, and that their size will have a strong redshift 
dependence, are likely to be crucial in developing a more complete 
understanding of observations of the high redshift universe. 